* Read data in ASCII format from Climatic Research Unit, East Anglia

*File format: each month has a list columns of longitude and rows of latitude
*So new month after 360*12 rows (=4320)

* Files for temp and precip are by decade
* Downloaded from https://data.ceda.ac.uk/badc/cru/data/cru_ts/cru_ts_4.08/data
* And gzunzip

* Files for scPDSI are all years: https://crudata.uea.ac.uk/cru/data/drought/
* And gzunzip

log using read_ascii_cru, replace
timer on 1


**# Precipitation and temperature in loop by variable and decade

foreach v in tmp pre {

foreach yr in 1991 2001 2011 {

local end=`yr'+9

infile `v'1-`v'720 using "../data/cru/cru_ts4.08.`yr'.`end'.`v'.dat", clear

* replace missing values with true missing
quietly: mvdecode `v'1-`v'720, mv(-999)


gen int y=360-mod((_n-1),360)
gen int year=int((_n-1)/4320)
gen int month=int((_n-1)/360)-year*12+1
replace year=year+`yr'

label var y "Latitude row #"


*create annual averages
collapse `v'1-`v'720, by(year y)

tempfile `v'`yr'
save ``v'`yr''
}

append using ``v'1991' ``v'2001'

*drop observations outside the range we use:
drop if y<31 | y>300

*renumber x to account for dropped polar rows

replace y=y-30
*create one obs per (x,y) pair per year
reshape long `v', i(y year) j(x) favor(speed)

*should be divided by 10 (surprise!)  See: https://crudata.uea.ac.uk/cru/data/hrg/#info
replace `v'=`v'/10


*create deviations based on all years back to 1991 before dropping extraneous years
bysort y x: egen av`v'=mean(`v')
gen dev_`v'=`v'-av`v'
label var dev_`v' "Temperature anomaly" 
label var `v' "Annual average temp (C)"


if "`v'" == "pre" {
rename  pre precip
rename dev_pre dev_precip
label var precip "Average monthly precip (mm)"
label var dev_precip "Precip anomaly"
} 



drop if year<2000 | year>2011

drop av`v'

sort x y year

save "../data/annual_av_`v'.dta", replace
}



**# sc-PDSI
clear
infile pdsi1-pdsi720 using "../data/cru/scPDSI.1901.2016.dat"  in 427681/l

* replace missing values with true missing
quietly: mvdecode pdsi1-pdsi720, mv(-9999)

gen int y=mod((_n-1),360)
gen int year=int((_n-1)/4320)
gen int month=int((_n-1)/360)-year*12+1
replace year=year+2000

*drop observations outside the range we use:
drop if y<31 | y>300

*renumber y to account for dropped polar rows
replace y=y-30

tempfile pdsi
save `pdsi'

*create annual averages and mins
forvalues i = 1(1)720 { 
    qui gen pdsi_min`i'=pdsi`i'
	qui gen pdsi_median`i'=pdsi`i'
    }

collapse (mean) pdsi1-pdsi720 (min) pdsi_min* (median) pdsi_median*, by(year y)
	
reshape long pdsi pdsi_min pdsi_median, i(y year) j(x) favor(speed)

rename pdsi pdsi_av

sort y x year

tempfile pdsi_annual
save `pdsi_annual'

use `pdsi', clear

keep if (month==6 & y<=150) | (month==12 & y>150)

drop month

reshape long pdsi, i(y year) j(x) favor(speed)

rename pdsi pdsi_summer

label var pdsi_summer "scPDSI in summer"

sort y x year

merge 1:1 y x year using `pdsi_annual', keep(match)
drop _merge 

save "../data/pdsi.dta", replace
 
timer off 1
timer list 1

log close
